Inverse Reinforcement Learning Under Noisy Observations (Extended Abstract)
نویسندگان
چکیده
We consider the problem of performing inverse reinforcement learning when the trajectory of the expert is not perfectly observed by the learner. Instead, noisy observations of the trajectory are available. We generalize the previous method of expectation-maximization for inverse reinforcement learning, which allows the trajectory of the expert to be partially hidden from the learner, to incorporate an observation model.
منابع مشابه
Inverse Reinforcement Learning Under Noisy Observations
We consider the problem of performing inverse reinforcement learning when the trajectory of the expert is not perfectly observed by the learner. Instead, a noisy continuoustime observation of the trajectory is provided to the learner. This problem exhibits wide-ranging applications and the specific application we consider here is the scenario in which the learner seeks to penetrate a perimeter ...
متن کاملA Spiking Neural Network Model of Model-Free Reinforcement Learning with High-Dimensional Sensory Input and Perceptual Ambiguity
A theoretical framework of reinforcement learning plays an important role in understanding action selection in animals. Spiking neural networks provide a theoretically grounded means to test computational hypotheses on neurally plausible algorithms of reinforcement learning through numerical simulation. However, most of these models cannot handle observations which are noisy, or occurred in the...
متن کاملMulti-Robot Inverse Reinforcement Learning Under Occlusion with State Transition Estimation
Multi-robot inverse reinforcement learning (mIRL) is broadly useful for learning, from passive observations, the behaviors of multiple robots executing fixed trajectories and interacting with each other. In this paper, we relax a crucial assumption in IRL to make it better suited for wider robotic applications: we allow the transition functions of other robots to be stochastic and do not assume...
متن کاملPreference elicitation and inverse reinforcement learning
We state the problem of inverse reinforcement learning in terms of preference elicitation, resulting in a principled (Bayesian) statistical formulation. This generalises previous work on Bayesian inverse reinforcement learning and allows us to obtain a posterior distribution on the agent’s preferences, policy and optionally, the obtained reward sequence, from observations. We examine the relati...
متن کاملPreference Elicitation and Inverse Reinforcement Learning
We state the problem of inverse reinforcement learning in terms of preference elicitation, resulting in a principled (Bayesian) statistical formulation. This generalises previous work on Bayesian inverse reinforcement learning and allows us to obtain a posterior distribution on the agent’s preferences, policy and optionally, the obtained reward sequence, from observations. We examine the relati...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2017